Support remote cache CDC #28437

Open
tyler-french wants to merge 1 commit into bazelbuild:master from tyler-french:tfrench/chunked-remote-cache

@tyler-french tyler-french commented Jan 26, 2026

TLDR: This PR enables content-defined chunking (FastCDC) for large uploads to and downloads from the remote cache. In the benchmark below it saves ~40% of disk cache storage and ~40-45% of upload bandwidth, and it makes builds faster by deduplicating similar artifacts across builds.

RELNOTES[NEW]: Added --experimental_remote_cache_chunking flag to read and write large blobs to/from the remote cache in chunks. Requires server support.

Motivation

Actions like GoLink and CppLink produce very large output files that are often similar between builds. A small source change can cause a cache miss, wasting storage, bandwidth, and time on nearly-identical artifacts.

Content-Defined Chunking (CDC) addresses this by splitting files at content-determined cut points. Because cut points are derived from the file content itself, small changes, even ones that shift bytes around, tend to affect only a few chunks. This makes action outputs effectively incremental: even though the action must re-run, the upload, download, and storage costs shrink dramatically.
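
To make the idea concrete, here is a minimal sketch of content-defined chunking with a gear-style rolling hash. It is a simplified illustration, not the PR's FastCDCChunker: the class name, the size limits, the mask, and the gear table are all assumptions chosen for readability, and it omits FastCDC's normalized chunking. A cut is placed wherever the low bits of the rolling hash are zero, so a cut depends only on the bytes just before it.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Simplified gear-hash CDC sketch. All constants are illustrative assumptions. */
final class CdcSketch {
  static final int MIN_SIZE = 256;          // never cut before this many bytes
  static final int MAX_SIZE = 4096;         // always cut at this many bytes
  static final long MASK = (1L << 10) - 1;  // cut when the low 10 hash bits are zero
  static final long[] GEAR = new long[256]; // one pseudo-random value per byte value

  static {
    Random random = new Random(42);
    for (int i = 0; i < GEAR.length; i++) {
      GEAR[i] = random.nextLong();
    }
  }

  /** Splits data into chunks whose boundaries are derived from the content. */
  static List<ByteBuffer> chunk(byte[] data) {
    List<ByteBuffer> chunks = new ArrayList<>();
    int start = 0;
    while (start < data.length) {
      int limit = Math.min(data.length, start + MAX_SIZE);
      int cut = limit; // forced cut if no content-defined cut is found
      long hash = 0;
      for (int i = start; i < limit; i++) {
        // Old bytes shift out of the low bits, so the test below only
        // depends on the last few bytes before position i.
        hash = (hash << 1) + GEAR[data[i] & 0xff];
        if (i - start + 1 >= MIN_SIZE && (hash & MASK) == 0) {
          cut = i + 1;
          break;
        }
      }
      chunks.add(ByteBuffer.wrap(Arrays.copyOfRange(data, start, cut)));
      start = cut;
    }
    return chunks;
  }

  /** Counts chunks of {@code after} whose content also occurs as a chunk of {@code before}. */
  static int sharedChunks(byte[] before, byte[] after) {
    Set<ByteBuffer> known = new HashSet<>(chunk(before));
    int shared = 0;
    for (ByteBuffer c : chunk(after)) {
      if (known.contains(c)) {
        shared++;
      }
    }
    return shared;
  }

  public static void main(String[] args) {
    byte[] before = new byte[64 * 1024];
    new Random(7).nextBytes(before);
    // Insert three bytes at the front, shifting every later byte.
    byte[] after = new byte[before.length + 3];
    System.arraycopy(before, 0, after, 3, before.length);
    System.out.println(
        sharedChunks(before, after) + " of " + chunk(after).size() + " chunks are unchanged");
  }
}
```

With fixed-size chunks, the three inserted bytes would shift every boundary and change every chunk. Here the boundaries move with the content, so most chunks after the edit are byte-identical to chunks the cache already holds.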

Results

Benchmarked across the last 50 commits of the BuildBuddy repo (server and client on the same host):

| Scenario | Upload | Download | RPCs | Disk Cache | Avg Build Time |
|---|---|---|---|---|---|
| chunking + disk cache | 52.0 GB | 0 B | 626K | 146.6 GB | 55s |
| chunking, no disk cache | 49.2 GB | 343.2 GB | 4.1M | n/a | 54s |
| no chunking + disk cache | 85.6 GB | 0 B | 273K | 246.5 GB | 100s |
| no chunking, no disk cache | 89.7 GB | 343.8 GB | 2.5M | n/a | 97s |

Key takeaways:

  • ~40-45% less data uploaded (52.0 GB vs 85.6 GB with a disk cache; 49.2 GB vs 89.7 GB without)
  • ~40% smaller disk cache (147 GB vs 247 GB)
  • Download size is essentially unchanged (343.2 GB vs 343.8 GB, a difference of ~0.2%) because we don't yet store downloaded chunks in the output base. Using a disk cache is recommended for full benefit; output-base chunk reuse is planned.
  • RPC count increases as expected (roughly 1.6x to 2.3x), since requests become smaller and more granular.
  • Faster builds in this benchmark (54-55s vs 97-100s on average); the gain depends on conditions such as async cache settings, compression, and network speed.

Additional benefits: better load balancing across distributed clusters (fewer long-running RPCs) and more granular retries on unstable networks.
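
For readers who want to check the takeaways against the table, the relative savings can be recomputed directly from the reported rows. The snippet below is only arithmetic on the numbers above; the class and method names are made up for illustration.

```java
/** Recomputes relative savings from the benchmark table. Names are illustrative. */
final class ChunkingSavings {
  /** Percentage by which {@code withChunking} is smaller than {@code baseline}. */
  static double reductionPercent(double baseline, double withChunking) {
    return 100.0 * (baseline - withChunking) / baseline;
  }

  public static void main(String[] args) {
    System.out.printf("upload, disk cache:    %.1f%%%n", reductionPercent(85.6, 52.0));
    System.out.printf("upload, no disk cache: %.1f%%%n", reductionPercent(89.7, 49.2));
    System.out.printf("disk cache size:       %.1f%%%n", reductionPercent(246.5, 146.6));
    System.out.printf("build time, disk cache: %.1f%%%n", reductionPercent(100, 55));
  }
}
```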

How It Works

Write path:

  1. Check if blob exceeds the chunking threshold.
  2. Run FastCDC to compute chunk boundaries.
  3. Call FindMissingBlobs to identify which chunks the server already has.
  4. Upload only the missing chunks.
  5. Call SpliceBlob to register the blob-to-chunks mapping on the server.
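
The five write-path steps can be sketched roughly as follows. This is not the PR's ChunkedBlobUploader: InMemoryCas is a hypothetical stand-in for the remote cache, its method names only mirror the RPC names, digests are plain SHA-256 hex strings, and fixed-size splitting stands in for FastCDC to keep the example short.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ChunkedUploadSketch {
  /** Hypothetical in-memory stand-in for the remote CAS (an assumption, not Bazel's API). */
  static final class InMemoryCas {
    final Map<String, byte[]> blobs = new HashMap<>();
    final Map<String, List<String>> splices = new HashMap<>();

    Set<String> findMissingBlobs(Collection<String> digests) {
      Set<String> missing = new LinkedHashSet<>(digests);
      missing.removeAll(blobs.keySet());
      return missing;
    }

    void upload(String digest, byte[] data) {
      blobs.put(digest, data);
    }

    void spliceBlob(String blobDigest, List<String> chunkDigests) {
      splices.put(blobDigest, new ArrayList<>(chunkDigests));
    }
  }

  static String sha256Hex(byte[] data) {
    try {
      StringBuilder hex = new StringBuilder();
      for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Returns the number of chunks uploaded, or -1 if the blob was uploaded whole. */
  static int upload(InMemoryCas cas, byte[] blob, int threshold, int chunkSize) {
    // Step 1: blobs at or below the threshold take the ordinary path.
    if (blob.length <= threshold) {
      cas.upload(sha256Hex(blob), blob);
      return -1;
    }
    // Step 2: compute chunk boundaries (fixed-size here; FastCDC in the PR).
    Map<String, byte[]> chunksByDigest = new LinkedHashMap<>();
    List<String> order = new ArrayList<>();
    for (int off = 0; off < blob.length; off += chunkSize) {
      byte[] chunk = Arrays.copyOfRange(blob, off, Math.min(blob.length, off + chunkSize));
      String digest = sha256Hex(chunk);
      chunksByDigest.put(digest, chunk);
      order.add(digest);
    }
    // Step 3: ask which chunks the server lacks.
    Set<String> missing = cas.findMissingBlobs(chunksByDigest.keySet());
    // Step 4: upload only those.
    for (String digest : missing) {
      cas.upload(digest, chunksByDigest.get(digest));
    }
    // Step 5: register the blob-to-chunks mapping.
    cas.spliceBlob(sha256Hex(blob), order);
    return missing.size();
  }
}
```

Uploading a second blob that shares most chunks with an earlier one transfers only the chunks that differ, which is where the upload savings come from.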

Read path:

  1. Check if blob exceeds the chunking threshold.
  2. Call SplitBlob to get the chunk list for this blob.
  3. Download and reassemble the chunks.

If --disk_cache is enabled, previously downloaded chunks are served locally.
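
A rough sketch of the read path, including the local lookup, might look like this. FakeRemote and the plain Map used as a disk cache are hypothetical stand-ins rather than Bazel classes, and the digests are just string keys; a real implementation would also verify each chunk and the reassembled blob against their digests.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ChunkedDownloadSketch {
  /** Hypothetical remote cache: knows blob -> chunk list and digest -> bytes. */
  static final class FakeRemote {
    final Map<String, List<String>> chunkLists = new HashMap<>();
    final Map<String, byte[]> blobs = new HashMap<>();
    int fetches = 0; // counts downloads so the effect of the local cache is visible

    List<String> splitBlob(String blobDigest) {
      return chunkLists.get(blobDigest);
    }

    byte[] fetch(String digest) {
      fetches++;
      return blobs.get(digest);
    }
  }

  static byte[] download(
      FakeRemote remote,
      Map<String, byte[]> diskCache,
      String blobDigest,
      long blobSize,
      long threshold) {
    // Step 1: small blobs are fetched whole.
    if (blobSize <= threshold) {
      return remote.fetch(blobDigest);
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Step 2: ask the server for the ordered chunk list.
    for (String chunkDigest : remote.splitBlob(blobDigest)) {
      // Step 3: serve each chunk locally if possible, otherwise download and keep it.
      byte[] chunk = diskCache.get(chunkDigest);
      if (chunk == null) {
        chunk = remote.fetch(chunkDigest);
        diskCache.put(chunkDigest, chunk);
      }
      out.write(chunk, 0, chunk.length);
    }
    return out.toByteArray();
  }
}
```

When two blobs share a chunk, the second download fetches only the chunks that are not already in the local cache.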

Dependencies

Depends on the APIs added in #28614

@tyler-french force-pushed the tfrench/chunked-remote-cache branch from 8a45f14 to dbc1af6 on January 27, 2026 04:17
@tyler-french force-pushed the tfrench/chunked-remote-cache branch 11 times, most recently from 789ab23 to 4349030, on January 28, 2026 06:00
@tyler-french force-pushed the tfrench/chunked-remote-cache branch 12 times, most recently from 6e3f676 to e795b34, on February 3, 2026 04:40
@tyler-french changed the title from "PROTOTYPE/WIP: support CDC" to "WIP: Support cache chunking" on Feb 3, 2026
@tyler-french changed the title from "WIP: Support cache chunking" to "WIP: Support cache chunking with FastCDC" on Feb 3, 2026
@tyler-french force-pushed the tfrench/chunked-remote-cache branch from e795b34 to e02742a on February 3, 2026 04:56
@tyler-french changed the title from "Support remote cache CDC" to "WIP/DONT REVIEW: Support remote cache CDC" on Feb 11, 2026
@tyler-french force-pushed the tfrench/chunked-remote-cache branch 2 times, most recently from dd3f8e9 to 20a1821, on February 11, 2026 23:55
@tyler-french changed the title from "WIP/DONT REVIEW: Support remote cache CDC" to "Support remote cache CDC" on Feb 11, 2026
@tyler-french force-pushed the tfrench/chunked-remote-cache branch from 20a1821 to 32bb5db on February 12, 2026 00:08
@tyler-french force-pushed the tfrench/chunked-remote-cache branch from 32bb5db to f528d14 on February 13, 2026 15:44
@tyler-french marked this pull request as ready for review on February 13, 2026 15:44
@tyler-french requested a review from a team as a code owner on February 13, 2026 15:44
Copilot AI review requested due to automatic review settings on February 13, 2026 15:44
@github-actions bot added the team-Remote-Exec (Issues and PRs for the Execution (Remote) team) and awaiting-review (PR is awaiting review from an assigned reviewer) labels on Feb 13, 2026

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces content-defined chunking (CDC) for remote caching, which is a significant feature for improving performance with large artifacts. The implementation across the client, server, and protocol seems well-structured. I've identified a critical memory issue in the remote worker's spliceBlob implementation that could lead to out-of-memory errors when handling large files. Additionally, there's a performance issue on the client-side uploader where files are read twice. Addressing these points will improve the robustness and efficiency of this new feature.
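
As an illustration of the memory concern raised above, one common way to avoid holding a whole blob in memory while splicing is to stream each chunk through a running digest and straight into the destination. The sketch below shows that general technique under assumed names; it is not the PR's spliceBlob code and not necessarily the fix the author chose.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

final class StreamingSpliceSketch {
  /**
   * Concatenates the chunk streams into {@code out} and returns the SHA-256 hex digest
   * of the concatenation. Only one 8 KiB buffer is held in memory at a time.
   */
  static String spliceAndDigest(List<? extends InputStream> chunks, OutputStream out) {
    try {
      MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
      byte[] buffer = new byte[8192];
      for (InputStream chunk : chunks) {
        int read;
        while ((read = chunk.read(buffer)) != -1) {
          sha256.update(buffer, 0, read); // hash as we go
          out.write(buffer, 0, read);     // and write as we go
        }
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : sha256.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A caller can compare the returned digest with the blob digest named in the splice request and reject the request on a mismatch. The same single pass also addresses the double-read concern, since hashing and writing share one read of the data.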

Copilot AI left a comment

Pull request overview

This PR adds experimental remote cache content-defined chunking (CDC) support using FastCDC 2020, enabling large blobs to be uploaded/downloaded as reusable chunks via new SplitBlob / SpliceBlob RPCs (requires server capability support).

Changes:

  • Updates REAPI protos to define SplitBlob / SpliceBlob, chunking capabilities/params, and related logging protos.
  • Implements client-side chunking flow (chunking config discovery, chunk upload + splice registration, chunked download + reassembly) gated behind --experimental_remote_cache_chunking.
  • Adds worker-side CAS support for the new RPCs plus unit/integration tests and a JMH benchmark.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 14 comments.

| File | Description |
|---|---|
| third_party/remoteapis/build/bazel/remote/execution/v2/remote_execution.proto | Adds Split/Splice RPCs, chunking capabilities/params, and various REAPI comment fixes. |
| src/tools/remote/src/main/java/com/google/devtools/build/remote/worker/CasServer.java | Implements worker-side SplitBlob / SpliceBlob handling for integration testing. |
| src/tools/remote/src/main/java/com/google/devtools/build/remote/worker/CapabilitiesServer.java | Advertises Split/Splice + FastCDC params in cache capabilities. |
| src/main/protobuf/remote_execution_log.proto | Adds log detail messages for SplitBlob/SpliceBlob and wires them into the RPC details oneof. |
| src/main/java/com/google/devtools/build/lib/remote/util/DigestUtil.java | Adds partial-array digest computation for chunk hashing. |
| src/main/java/com/google/devtools/build/lib/remote/options/RemoteOptions.java | Adds --experimental_remote_cache_chunking flag. |
| src/main/java/com/google/devtools/build/lib/remote/logging/SplitBlobHandler.java | Adds logging handler for SplitBlob calls. |
| src/main/java/com/google/devtools/build/lib/remote/logging/SpliceBlobHandler.java | Adds logging handler for SpliceBlob calls. |
| src/main/java/com/google/devtools/build/lib/remote/logging/LoggingInterceptor.java | Routes SplitBlob/SpliceBlob RPCs to the new logging handlers. |
| src/main/java/com/google/devtools/build/lib/remote/common/RemoteCacheClient.java | Extends cache client interface with optional spliceBlob hook. |
| src/main/java/com/google/devtools/build/lib/remote/chunking/FastCDCChunker.java | Introduces FastCDC-based chunk boundary selection and chunk digesting. |
| src/main/java/com/google/devtools/build/lib/remote/chunking/ChunkingConfig.java | Introduces chunking configuration and server-capability-derived defaults. |
| src/main/java/com/google/devtools/build/lib/remote/chunking/BUILD | Adds BUILD target for the new chunking library. |
| src/main/java/com/google/devtools/build/lib/remote/RemoteServerCapabilities.java | Adds compatibility checks for chunking-related server capabilities. |
| src/main/java/com/google/devtools/build/lib/remote/RemoteModule.java | Discovers chunking config from server capabilities and passes it to the cache client. |
| src/main/java/com/google/devtools/build/lib/remote/GrpcCacheClient.java | Adds client implementations of SplitBlob / SpliceBlob RPC calls. |
| src/main/java/com/google/devtools/build/lib/remote/CombinedCache.java | Integrates chunked upload/download paths into CAS file/blob operations. |
| src/main/java/com/google/devtools/build/lib/remote/ChunkedBlobUploader.java | Implements chunked upload: chunk -> find-missing -> upload missing -> splice. |
| src/main/java/com/google/devtools/build/lib/remote/ChunkedBlobDownloader.java | Implements chunked download: split -> download chunks -> reassemble. |
| src/main/java/com/google/devtools/build/lib/remote/BUILD | Wires the new chunking sources into the main remote library build. |
| src/test/java/com/google/devtools/build/lib/remote/chunking/FastCDCChunkerTest.java | Unit tests for chunk boundary behavior and digest correctness. |
| src/test/java/com/google/devtools/build/lib/remote/chunking/ChunkingConfigTest.java | Unit tests for defaults and server capability parsing. |
| src/test/java/com/google/devtools/build/lib/remote/chunking/FastCDCBenchmark.java | Adds a JMH benchmark for chunking throughput. |
| src/test/java/com/google/devtools/build/lib/remote/chunking/BUILD | Adds test and benchmark targets for chunking tests. |
| src/test/java/com/google/devtools/build/lib/remote/ChunkedCacheIntegrationTest.java | Integration tests for remote-only chunking behavior via SplitBlob. |
| src/test/java/com/google/devtools/build/lib/remote/ChunkedDiskCacheIntegrationTest.java | Integration tests for chunking with disk cache capturing chunk blobs. |
| src/test/java/com/google/devtools/build/lib/remote/ChunkedBlobUploaderTest.java | Unit tests for chunked uploader behavior (missing-chunk selection and data correctness). |
| src/test/java/com/google/devtools/build/lib/remote/ChunkedBlobDownloaderTest.java | Unit tests for chunked downloader behavior (reassembly ordering and edge cases). |
| src/test/java/com/google/devtools/build/lib/remote/RemoteSpawnRunnerWithGrpcRemoteExecutorTest.java | Updates GrpcCacheClient construction to include the new chunkingConfig parameter. |
| src/test/java/com/google/devtools/build/lib/remote/GrpcCacheClientTest.java | Updates GrpcCacheClient construction to include the new chunkingConfig parameter. |
| src/test/java/com/google/devtools/build/lib/remote/ByteStreamBuildEventArtifactUploaderTest.java | Updates GrpcCacheClient construction to include the new chunkingConfig parameter. |
| src/test/java/com/google/devtools/build/lib/remote/BUILD | Registers new chunking-related integration tests and adds chunking sources/deps. |

@tyler-french force-pushed the tfrench/chunked-remote-cache branch from aaeb1b9 to 9d508a6 on February 21, 2026 19:04
@tyler-french force-pushed the tfrench/chunked-remote-cache branch from 9d508a6 to 23b5f14 on February 22, 2026 01:41
tyler-french added a commit to buildbuddy-io/buildbuddy that referenced this pull request Feb 23, 2026
This is needed to run `bb print` to see Split/Splice calls, to test
bazelbuild/bazel#28437

`bb print` just reads the gRPC log directly from what's in this file, so
updating this file is sufficient.